Primary Output
all_stocks_fundamental_analysis.json.gz
Primary deliverable containing comprehensive analysis for all 2,775 stocks with 86 fields per stock.
Secondary Outputs
sector_analytics.json.gz
Sector-level aggregations and analytics.
- Format: JSON (gzip compressed)
- Generated by: process_market_breadth.py
- Retention: Permanent
market_breadth.json.gz
Market breadth indicators and relative strength ratings.
- Format: JSON (gzip compressed)
- Generated by: process_market_breadth.py, process_historical_market_breadth.py
- Retention: Permanent
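The compressed outputs above are ordinary gzip-wrapped JSON, so they can be read with the Python standard library alone. The sketch below is a minimal example. The helper name `load_gzip_json` is hypothetical, and because the top-level shape of each file (list or dict) is not described on this page, the usage only reports the type and length.

```python
import gzip
import json
from pathlib import Path


def load_gzip_json(path):
    """Read a gzip-compressed JSON file and return the parsed object."""
    # "rt" opens the stream in text mode so json.load receives str, not bytes.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # File name taken from this page; the working directory is an assumption.
    path = Path("all_stocks_fundamental_analysis.json.gz")
    if path.exists():
        data = load_gzip_json(path)
        # Top-level structure is not documented here, so only inspect it.
        print(type(data).__name__, len(data))
```

The same helper should work for sector_analytics.json.gz and market_breadth.json.gz, since all three are described as gzip-compressed JSON.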
ohlcv_data/
Historical OHLCV (Open, High, Low, Close, Volume) data for all stocks.
Intermediate Files (Auto-Cleaned)
When CLEANUP_INTERMEDIATE = True (default), the following files are automatically deleted after pipeline completion:
Core Data Files
master_isin_map.json
Maps stock symbols to ISIN codes and security IDs.
- Size: ~500 KB
- Used by: All Phase 2 scripts
- Generated by: fetch_dhan_data.py
dhan_data_response.json
Full market snapshot with technical indicators for all stocks.
- Size: ~15 MB
- Records: 2,775 stocks
- Generated by: fetch_dhan_data.py
fundamental_data.json
Quarterly results and financial ratios for all stocks.
- Size: ~35 MB
- Records: 2,775 stocks × quarterly data
- Generated by: fetch_fundamental_data.py
advanced_indicator_data.json
Pivot points, EMA/SMA signals, and technical sentiment.
- Size: ~8.3 MB
- Generated by: fetch_advanced_indicators.py
all_company_announcements.json
Live corporate announcements and regulatory filings.
- Size: ~5-10 MB
- Generated by: fetch_new_announcements.py
Corporate Actions Files
upcoming_corporate_actions.json
Upcoming dividends, bonus issues, splits, and results dates.
- Time Range: Next 2 months
- Generated by: fetch_corporate_actions.py
history_corporate_actions.json
Historical corporate actions.
- Time Range: Last 2 years
- Generated by: fetch_corporate_actions.py
Market Intelligence Files
bulk_block_deals.json
Bulk and block deals from the last 30 days.
- Generated by: fetch_bulk_block_deals.py
upper_circuit_stocks.json / lower_circuit_stocks.json
Stocks hitting circuit limits.
- Generated by: fetch_circuit_stocks.py
nse_asm_list.json / nse_gsm_list.json
ASM (Additional Surveillance Measure) and GSM (Graded Surveillance Measure) lists.
- Generated by: fetch_surveillance_lists.py
incremental_price_bands.json
Daily price band revisions.
- Generated by: fetch_incremental_price_bands.py
complete_price_bands.json
Complete list of all securities with their current price bands.
- Generated by: fetch_complete_price_bands.py
nse_equity_list.csv
NSE listing dates for all equity securities.
- Source: NSE Archives
- Downloaded via: cURL in run_full_pipeline.py
Directories
company_filings/
Individual filing JSONs for each stock.
- Files: {SYMBOL}_filings.json (2,775 files)
- Total Size: ~100-200 MB
- Content: Top 100 regulatory filings per stock (hybrid from LODR + Legacy endpoints)
- Generated by: fetch_company_filings.py
market_news/
Sentiment-analyzed news for each stock.
- Files: {SYMBOL}_news.json (2,775 files)
- Total Size: ~50-100 MB
- Content: Top 50 news items per stock with AI sentiment (positive/negative/neutral)
- Generated by: fetch_market_news.py
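Because both directories follow a {SYMBOL}-based naming pattern, the path to a stock's file can be built directly from its symbol. The sketch below assumes the directories sit under a common base directory. The helper names (`filings_path`, `news_path`, `load_if_present`) and the symbol "ABC" are hypothetical, and the JSON schema inside each file is not documented here, so the loader returns whatever the file contains.

```python
import json
from pathlib import Path


def filings_path(symbol, base_dir="."):
    """Path to company_filings/{SYMBOL}_filings.json under base_dir."""
    return Path(base_dir) / "company_filings" / f"{symbol}_filings.json"


def news_path(symbol, base_dir="."):
    """Path to market_news/{SYMBOL}_news.json under base_dir."""
    return Path(base_dir) / "market_news" / f"{symbol}_news.json"


def load_if_present(path):
    """Return parsed JSON, or None when the file is absent.

    These directories are intermediate and are removed when cleanup is
    enabled, so a missing file is an expected case rather than an error.
    """
    path = Path(path)
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    # "ABC" is a placeholder symbol, not a real listing.
    print(filings_path("ABC"), load_if_present(filings_path("ABC")) is not None)
```

Returning None for a missing file keeps callers simple when the pipeline has already cleaned up after itself.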
Base JSON (Replaced by .gz)
all_stocks_fundamental_analysis.json
Uncompressed version of the master output.
- Size: ~30-40 MB
- Deleted after: Compression to .json.gz in Phase 5
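The compress-then-delete step described above can be sketched with the standard library. This is an illustration of the idea, not the pipeline's actual Phase 5 code. The function name `compress_and_remove` is hypothetical.

```python
import gzip
import shutil
from pathlib import Path


def compress_and_remove(json_path):
    """Gzip json_path to '<name>.gz' beside it, then delete the original."""
    src = Path(json_path)
    dst = src.with_name(src.name + ".gz")
    # Stream in chunks so a 30-40 MB file is never held fully in memory.
    with src.open("rb") as f_in, gzip.open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    # Only reached if compression finished without raising.
    src.unlink()
    return dst


if __name__ == "__main__":
    target = Path("all_stocks_fundamental_analysis.json")
    if target.exists():
        print(compress_and_remove(target))
```

Deleting the source only after the gzip stream has closed means a failure mid-compression leaves the uncompressed file in place.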
Cleanup Behavior
Configuration Flag
What Gets Kept
Compressed Outputs
- all_stocks_fundamental_analysis.json.gz
- sector_analytics.json.gz
- market_breadth.json.gz
OHLCV Data
- ohlcv_data/ directory (all CSV files)
What Gets Deleted
- All intermediate JSON files (13 files)
- company_filings/ directory
- market_news/ directory
- nse_equity_list.csv
- Uncompressed .json versions
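The keep/delete rules above can be expressed as a small cleanup routine. The sketch below is an assumption about how such a step might look, not the pipeline's own code. The function name `cleanup` is hypothetical, the file names are copied from the entries on this page and may not be exhaustive, and only the flag name CLEANUP_INTERMEDIATE comes from the documentation.

```python
import shutil
from pathlib import Path

CLEANUP_INTERMEDIATE = True  # flag name from this page; its real location is not shown

# Names copied from this page; the real pipeline's list may differ.
INTERMEDIATE_DIRS = ("company_filings", "market_news")
INTERMEDIATE_FILES = (
    "master_isin_map.json", "dhan_data_response.json",
    "fundamental_data.json", "advanced_indicator_data.json",
    "all_company_announcements.json", "upcoming_corporate_actions.json",
    "history_corporate_actions.json", "bulk_block_deals.json",
    "upper_circuit_stocks.json", "lower_circuit_stocks.json",
    "nse_asm_list.json", "nse_gsm_list.json",
    "incremental_price_bands.json", "complete_price_bands.json",
    "nse_equity_list.csv", "all_stocks_fundamental_analysis.json",
)


def cleanup(base_dir, enabled=CLEANUP_INTERMEDIATE):
    """Delete listed intermediate files and directories; return names removed."""
    removed = []
    if not enabled:
        return removed
    base = Path(base_dir)
    for name in INTERMEDIATE_DIRS:
        target = base / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    for name in INTERMEDIATE_FILES:
        target = base / name
        if target.is_file():
            target.unlink()
            removed.append(name)
    # Anything not named above (the .json.gz outputs, ohlcv_data/) is untouched.
    return removed
```

Using an explicit allow-list of names to delete, rather than a pattern such as "every .json file", avoids removing files the routine does not know about.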
Cleanup Report
The pipeline prints a summary after cleanup.
File Size Summary
| Category | Size (Before Cleanup) | Size (After Cleanup) |
|---|---|---|
| Compressed Outputs | ~8-10 MB | ~8-10 MB |
| OHLCV Data | ~500 MB - 2 GB | ~500 MB - 2 GB |
| Intermediate Files | ~200-400 MB | 0 MB |
| Total | ~700 MB - 2.4 GB | ~500 MB - 2 GB |
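The figures in the table are approximate. To check them against a local run, disk usage can be measured with a short helper like the sketch below. The function names `path_size_bytes` and `format_size` are hypothetical, the paths in the usage section assume the outputs sit in the current directory, and sizes use 1 MB = 1024 × 1024 bytes.

```python
import os
from pathlib import Path


def path_size_bytes(path):
    """Total size in bytes of a file, or of all files under a directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size
    total = 0
    # os.walk yields nothing for a missing path, so the result is 0.
    for root, _dirs, files in os.walk(p):
        for name in files:
            total += (Path(root) / name).stat().st_size
    return total


def format_size(num_bytes):
    """Render a byte count with one decimal in B, KB, MB, or GB."""
    for unit in ("B", "KB", "MB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} GB"


if __name__ == "__main__":
    for name in ("ohlcv_data", "all_stocks_fundamental_analysis.json.gz"):
        print(name, format_size(path_size_bytes(name)))
```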
First run: OHLCV download takes ~30 minutes and generates ~2 GB of historical data.
Subsequent runs: OHLCV updates take ~2-5 minutes (incremental sync only).
Standalone Outputs (Optional)
These files are not included in the main pipeline unless FETCH_OPTIONAL = True:
FNO Data
- fno_stocks_response.json - 207 F&O stocks
- fno_lot_sizes_cleaned.json - Lot sizes for F&O contracts
- fno_expiry_calendar.json - Expiry dates for futures and options
Indices & ETFs
- all_indices_list.json - 194 market indices
- etf_data_response.json - 361 ETFs
- Generated by: fetch_all_indices.py, fetch_etf_data.py